perm filename NOTES.PRJ[LSP,JRA]6 blob sn#254384 filedate 1976-12-25 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00007 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	.SEC(Projects)
C00009 00003	.SS(Pretty-printing,,P226:)
C00019 00004	.SS(Syntax-directed processes,syntax-directed)
C00021 00005	.SS(Syntax directed computation,syntax-directed,P4:)
C00022 00006	.SS(One-pass assemblers)
C00024 00007	.SS(A theory project)
C00033 ENDMK
C⊗;
.SEC(Projects)
This chapter consists of a set of non-trivial projects which either apply LISP 
or extend LISP by adding new language features.
.SS(Extensions to %3eval%*,%3eval%*,P67:)
.SELECT 1;


.BEGIN TURN ON "#";
This next extension to %3eval%*  was derived from the syntax of 
⊗→MUDDLE↔←⊗↑[Mud#75]↑, ⊗→CONNIVER↔←⊗↑[Con#73]↑, and
⊗→MICRO-PLANNER↔←⊗↑[Mic#71]↑.
We have seen that LISP calling sequences are of two varieties: either
evaluate %2all%* of the arguments; or  evaluate %2none%* of the
arguments.

In an attempt to generalize this regime we might allow the evaluation of some
of the arguments and enlarge on the domain
of objects which can appear in the list of λ-variables.
We might partition the formal parameters into required parameters, optional 
parameters, and an excess collector to handle any actual parameters left over.
Required parameters %2must%* have corresponding actual parameters; optional
actual parameters are used if present, otherwise declared default 
values are used.
If there are more actual parameters than the formals encompassed by the first
two classes, then they are associated with the excess collector.

To be more precise consider the following possible BNF equations:

.BEGIN TABIT1(13);TURN OFF "←";

<varlist>\::=[<required> <optional> <excess>]

<required>\::= <par>; ...;<par> | %cε%* ⊗↓The symbol "%cε%1" stands for the empty string.←

<optional>\::= "optional" <opn>; ...; <opn> | %cε%*

<excess>\::= "excess" <par> | %cε%*

<par>\::= <variable> | %9`%*<variable>

<opn>\::= <par> | <par> ← <form>

.END
.GROUP;
The associated semantics are as follows:

.BEGIN INDENT 0,10;TURN OFF "←";
%21.%*  The formal parameters are to be bound to the actual parameters from
left to right  as usual.

%22.%*  There must be an actual parameter for each required parameter, and
if there is no excess collector there may not be more actual parameters than
formals. (There may be fewer if we have optionals.)

%23.%*  If a <variable> in a formal parameter is preceded by a "%9`%*", then
the corresponding actual parameter is %2not%* evaluated. This is just the
%3quote%1-ing %3read%1 macro.

%24.%*  We might run out of actual parameters while binding the optionals.
If we do, then we look at the remaining formal optionals.
If a formal parameter is simply a <par> then we bind it to %3(#)%*; if a formal is
%9`%1<variable> ← <form> then we bind the <variable> to the <form>; 
and if the formal is
<variable> ← <form>, we bind <variable> to the value of <form>, where the
evaluation is to take place %2after%1 the required parameters have been bound.

%25.%*  Finally, the excess collector is bound to a list of any remaining
actual parameters in the obvious way: if <par> is <variable> then 
using the calling environment, form a list
of the values of the remaining arguments; 
if it is %9`%1<variable>, bind <variable> to the
actual list. If there is no excess, bind to %3NIL%*.
.END
.APART

.BEGIN TURN OFF "←";
We will also extend %3prog%*-variables slightly, allowing 
them to be initialized explicitly. If a %3prog%*-variable is atomic,
intialize it to %3(#)%*, as usual. If it is of the form <variable>#←#<form>
then initialize it to the value of the <form>.
.END
.END

.BEGIN TURN OFF "←";turn on "\"; tabs 10,18;
Here are some examples:

1. In the initialization of %3length%* on {yon(P45)}, we could write:
%3...#prog[[l#←#x;#c#←#0]#....%*

2. %3list%* could now be defined as: %3λ[[%1"excess"%* x]x]%*.

3. Consider the following definition:
.BEGIN NOFILL;
.group

\%3baz <= λ[\[x;%9`%*y;%1"optional"%* z; u ← 0; %1"excess"%* v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v] ].
.APART
.GROUP
%1Then a call of:

.BEGIN CENTER;
%3eval[(BAZ 2 (CAR (QUOTE (X Y))) 4 5 6 7 (CAR (QUOTE (A . B))));NIL]%* 
.END

would print:%3 \\2  
\\(CAR(QUOTE (X Y)))
\\4
\\5 
\\(6 7 A)%*

and return value: %3(6 7 A)%*.
.APART

.GROUP
Similarly, defining:

\%3fii <= λ[\[x;y;%1"optional"%* z; u ← 0; %1"excess"%3 v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].

.BEGIN CENTERit;
%1and calling: ←%3eval[(FII 2 (CAR (QUOTE (X Y)));NIL]%* 
.END
prints:%3\\2 
\\X 
\\NIL 
\\0 
\\NIL.%*
.END
.END

.CENT(Problems);

Design simple S-expr representations of these proposed constructs.
Make these extensions to %3eval%*.
.SS(Pretty-printing,,P226:)
This project expands on the basic notion of "pretty-printing" which
was introduced on {yon(P245)}⊗↓Pretty printers are called
"grinders" at MIT.←. The content of this section introduces
several possible considerations involved in designing asuch a pretty
printer. The intent of this section is for you to take these suggestions and
develop a suitable program based on the suggestions and your experience
with your locally available prettry printer.
The formats described in this section were extracted form ⊗↑[Gol#73]↑.
.GROUP;

I. %2Linear format%1: The minimal acceptable output format is that produced by %3print%1.
Acceptability is judged by whether that output can be read back into the
machine and have a structure %3equal%1 the the structure printed out⊗↓We
must hedge that a bit, since %3gensym%1 atoms are not  placed on the
object list.←.
.APART
.GROUP;

II. %2Standard format%1: Given a list %3(%7a b c %1...%7 d%3)%1 the standard format
will assume that we are trying to print a function application and will producce:
.BEGIN GROUP;TABIT2(20,23);

\%3(%7a\b
\\c
\\ %1. . .%7
\\d%3)
.END
Thus %3(FOO (CAR (CONS (QUOTE A) B)) (G (H A) 4))%1 would produce:

.BEGIN GROUP;TABS 20,25,28,37;NOFILL;TURN ON "\";

\%3(FOO\(CAR (CONS\(QUOTE A)
\\\\B))
\\(G\(H A)
\\\4))
.END

Note that the "standard format" is recursively applied, and thus 
may become too wide for the output device. It that case we can resort to the 
following format.
.APART
.GROUP;

III. %2Miser format%1: Write a list %3(%7a b c %1...%7 d%3)%1 as:
.BEGIN GROUP;TABIT2(20,22);

\%3(%7a
\ b
\ c
\  %1. . .%7
\ d%3)
.END
Again, the recursive application of this format can overflow the output
width. In that case we may have to resort to  "linear format".
.APART;
.GROUP;

Typical pretty printers also recognize certain LISP constructs
and "grind" them differently. That is, we build some semantic knowledge into the
grinder. Block format is useful in many of these contexts.

IV. %2Block format%1: A list %3(%7a%41%1#...#%7a%4n-1%1,#...#%7a%4n%3)%1
has  the block format:
.BEGIN TABIT2(20,21);

\%3(\%7a%41%1#...#%7a%4n-1%1
\\%7a%4n%1#...#%7a%4n%3)%1
.END

For  example, the list representation of a %3prog%1 might be "ground"
with its %3prog%1 variables in block format and special indenting
in the %3prog%1#body to set off the labels.
The representation of %3length%1 given on {yon(P250)} illustrates
the special format for %3prog%1s, though the %3prog%1#variable list is
not sufficently long to require block format.
.APART
.GROUP;

Another list format which is treated specially is the representation of
a λ-expression. 
.BEGIN GROUP;TABIT2(20,31);
\%3(LAMBDA\%1<λ-list ground in block format>
\\<body ground as a block>%3)
.END
The example on {yon(P250)} also illistrates this.
This format will allow multiple-bodied λ-expressions.
For example:
.BEGIN SELECT 3; NOFILL;TURN ON "\";TABS 20,30,37
\(LAMBDA\(X Y Z)
\\(CONS\X 
\\\(QUOTE A))
\\(H 1 2))
.END
.APART
Notice that we decided to write %3(H 1 2)%1 in linear format
rather than standard format; somehow it "looks better". Personal
taste plays a  strong role in pretty printing, so several
grinders give  the users the ability to  describe their own
formats. We will see a similar, but more general device
in {yonss(P105)}.  Another possibility for user extension
lies with our property list evaluator of {yonss(P237)}.
Part of the definition of the various  LISP constructs would
allow the specification of output conventions for instances
of those constructs;  thus besides specifying evaluation properties
for %3LAMBDA%1, %3PROG%1, and %3COND%1, we would also  the
grinding routines for outputting instances of those constructs.

.GROUP;
Since lists beginning with %3COND%1 are representations of
conditional expressions they too are handled specially
by the grinding routines.

.BEGIN TABIT2(20,27);
\%3(COND%1\<grind clause%41%1>
\\  . . .
\\<grind clause%4n%1>%3)
.END
.APART

These selected formats should give a reasonable idea of the techniques available
for pretty printers. More techniques can be extracted from your local grinder.

.GROUP;
Your pretty printer may assume the existence of %3patom%1, %3print%1, and
%3terpri%1 of {yonss(P13)}; and you may assume the existence of
the usual class of arithmetic operations. In addition, the following
formatting primitives may be used:

.BEGIN INDENT 0,10;GROUP;
%21.%1 %3linelength%1: If %3linelength%1 is called without an argument
then  the value returned is the number of characters allowed on a
line of the current output device. If %3linelength%1 is called with
a numeric argument, then the line length of the  output device is set to that
number.
.END
.APART


.BEGIN INDENT 0,7;GROUP;
%22.%1 %3chrct%1: This is a nullary function, and returns
a number representing the number of character positions remaining on the current
line. For example, just after %3terpri[]%1 has been executed
%3chrct[]#=#linelength[]%1, and 
%3prog%42%3[patom[ABC];#difference[linelength[];chrct[]]]#=#3%1.
.END

.BEGIN INDENT 0,10;GROUP;
%23.%1 %3flatsize%1:
A simple count of the
atoms and special characters in an expression won't give an accurate
picture of the requirements for printing an expression.
Special characters take one character position, but
literal atoms and numbers  may require more. 
To determine whether or not a special format can be used on an expression
requires knowledge of its character count. The number of characters
in the atom %3x%1 is given by %3flatsize[x]%1;
%3flatsize[ABCD]%1 is %34%1. Actually %3flatsize%1 is defined
for non-atomic  arguments, giving the number of character positions
required to %3print%1 its argument. For example %3flatsize[(A.B)]%1
is %37%1 rather than %35%1 since %3print%1 will surround the
dot with spaces.
.END
.SS(Syntax-directed processes,syntax-directed)
%1
.REQUIRE "NOTES.SDP" SOURCE_FILE;
.SS(Syntax directed computation,syntax-directed,P4:)

compilation for sae

BNF for mexprs

syntax directed compiler

scanner parser

general syntax directed computation
.SS(One-pass assemblers)
.P10:


Write a one-pass assembler for the code generated by the %3compile%* function
of this section. You should be aware of the following points:

.BEGIN INDENT 0,5;

%2a.%*  %3QUOTE%*d expressions must be protected from garbage collection. The
simplest way to accomplish this it to  save them on a list, say %3QLIST%*.

%2b.%*  Use the operation codes of {yonss(P7)})

%2c.%*  Design a simple fixup scheme.  Notice that %3compile%*'s code will
require address fixups at most.

%2d.%*  Recall that the format of the code is a list. The items in the list are
either atomic -- representing labels --, or lists -- representing instructions--.
The instructions have varying format.  Perhaps a ⊗→table-driven↔← scheme can be used.

%2e.%*  Some attempt should be made to generate error messages.

%2f.%*  Many of the constants, %3(C n)%*, occur frequently; these constants
are only referenced, never changed by execution of the compiled code.  Write
your assembler to maintain only one copy of each. The constants should be stored
directly after the compiled code.  

%2f%*.  Try to be equally clever about storing %3QUOTE%*d expressions.
.END

.SS(A theory project)
The point of the following section is to introduce the computer science
student to some of the more theoretical areas of his province and
similarly to convince the mathematically inclined that there are
interesting and challenging problems concerning the foundations
of computer science. 

We will use our experience in LISP as a starting point. It perhaps
seems presumptuous to base a theoretical discussion on knowledge of a  
programming language, but as we will see there is much more to LISP than
its use in  symbolic programming. In particular, LISP can be used to introduce
the general area of recursion theory.  For example, one of the
fundamental results in recursion theory, is the proof of the non-existence
of an algorithm to determine whether or not a function was total.
The existence of such a function would certainly have made our life
more pleasant. Alas, a predicate, call it %3total%*, does not exist as we
shall now establish.

The proof depends on our knowledge about the function %3apply%*. The fundamental
relationship is:

.BEGIN INDENT 10,10;
For a function %3f%* and arguments %3a%41%1,#...#,%3a%4n%1 we know
that if 
.BEGIN CENTER;
%3f[a%41%3;#...;a%4n%3]%1 is defined then 
%3f[a%41%3;#...;a%4n%3]%1 =  %3apply[%eR%f(%3f%f)%3list[%eR%f(%3a%41%f)%3;#...;%eR%f(%3a%4n%f)%3];env]%1
.END
This omniscient property of %3apply%* makes it a %2⊗→universal function↔←%* for
LISP. That means that if %3apply%* is given an encoding of a function, of
some arguments to be applied, and an environment which contains 
the definition of %3f%* and all the
necessary subsidiary definition needed by %3f%*, then %3apply%*
can simulate the behavior of %3f%*. 

.END
For sake of concreteness in this 
discussion we will assume that the representation of %3env%* is 
a list of dotted pairs: representation of name dotted with representation 
of λ-expression. Given a function named %3g%*, together with its λ-definition
will designate the S-expr representation as %3g%8*%1.

Then if %3f%* uses %3f%41%1 through %3f%4n%1 as subsidiary
we will represent the environment as:
.BEGIN CENTER; SELECT 3;
list[f%8*%3;f%41%8*%3;...;f%4n%8*%3]
.END

Now assume the existence of a unary predicate %3total%* such that:
.BEGIN TABIT1(10);
\|gives %et%* if %3x%* is a representation of a total unary⊗↓This discussion 
will nominally concern unary functions; the generalization is obvious.←function.
%3total[x]%1 <
\|gives %ef%1 in all other cases.
.END

Notice that if %3total[list[f%8*%3;#...]%1 is true, then for arbitrary %3a%*
we will have %3apply[name[f%8*%3];list[a];#list[f%8*%3;#...]]%1 terminating and
giving value %3f[a]%*.

Now we define a function:
.BEGIN SELECT 3;CENTER;
diag[x] <= [total[x] → list[apply[name[first[x]];list[x];x]]; %et%* → %ef%*]
.END
This mild-mannered function is total. Let's form %3diag%8*%1:
.BEGIN SELECT 3;NO FILL;
(DIAG . 
   (LAMBDA (X)
           (COND ((TOTAL X) (LIST (APPLY (NAME (FIRST X))(LIST X) X)))
			   (T NIL))) )
.END
Finally form %3zowie%* = %3list[diag%8*%3; total%8*%3; apply%8*%3; ...]%1. That is
%3zowie%* will be the representation of %3diag%* and all its necessary
functions.

Evaluate %3diag[zowie]%*. Since %3diag %2is%1 total, then %3total[zowie]%* is
true, and we are faced with
.BEGIN CENTER; SELECT 3;

diag[zowie] = list[apply[name[first[zowie]];list[zowie];zowie]]
.END
but %3name[first[zowie]] = DIAG%1; and therefore the call on %3apply%* reduces
to 
%3apply[DIAG;list[zowie];zowie]%*. But that's just %3diag[zowie]%*, and we've
shown %3daig[zowie]#=#list[diag[zowie]]%1. That's just not possible. Thus the
assumption of the existence of %3total%* must be in error.

The typical proof of this result involves encoding the functions in the integers
and then expressing  the equivalent of the %3apply%* function as an algorithm in
number theory. 
The trick of encoding on the integers is analogous to what %2we%* did
in encoding in the S-expressions. 
The presentation in LISP is much more understandable. It's the old problem of
representation again. Mapping to number theory is too representation dependent;
we seldom need properties of the integers.

Thus LISP %2is%* more than a programming language; its a formalism for discussing
computation as well⊗↓Indeed, LISP was originally meant
to be a formalism for problems in Artificial Intelligence; 
S. Russell suggested to McCarthy that LISP could be turned into a 
language for a machine.McCarthy had doubts, but Russell prevailed.←. 


The recent
work in mathematical semantic of programming languages does indeed draw on
strong mathematical lines, but it must be remembered that the study is %2not%*
formal mathematics; it is not toploogy; it is not algebra, it is not recursion
theory. The ultimate justification for the studies come from  problems
in computer science, and it is to those areas that we should go for motivation.
This is not to say that mathematics departments must become dispensers of tools
for computer science departments, any more than they are resident experts for
the engineering schools. But this is an era when students are asking for 
"relevance"; computer science is an area which is full of relevance, full
of interesting questions best described and approached by sophisticated
mathematical and logical tools.